Latent Dirichlet Allocation
Authors
Abstract
We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.
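The generative story in the abstract (each document is a mixture of topics, with mixture proportions drawn from a Dirichlet distribution) can be illustrated with a minimal sketch. This is not the authors' code. The function name `generate_corpus`, the parameter names `alpha` (Dirichlet parameter) and `beta` (a topic-by-word probability matrix), and the fixed document length are all assumptions made for illustration. Only sampling is shown here; the variational inference and learning mentioned in the abstract are not.

```python
import numpy as np


def generate_corpus(n_docs, doc_length, alpha, beta, seed=0):
    """Sample documents from an LDA-style generative process.

    Hypothetical helper for illustration only.
    alpha: Dirichlet parameter, shape (n_topics,)
    beta:  per-topic word probabilities, shape (n_topics, vocab_size),
           each row summing to 1 (assumed given, not learned here)
    doc_length: fixed number of words per document (a simplification)
    """
    rng = np.random.default_rng(seed)
    n_topics, vocab_size = beta.shape
    docs, thetas = [], []
    for _ in range(n_docs):
        # Per-document topic proportions: the latent Dirichlet variable.
        theta = rng.dirichlet(alpha)
        # One topic assignment per word position.
        z = rng.choice(n_topics, size=doc_length, p=theta)
        # Each word is drawn from the word distribution of its topic.
        words = np.array([rng.choice(vocab_size, p=beta[k]) for k in z])
        docs.append(words)
        thetas.append(theta)
    return docs, np.array(thetas)


# Usage: 3 topics over a 10-word vocabulary, 5 documents of 20 words each.
rng = np.random.default_rng(42)
beta = rng.dirichlet(np.ones(10), size=3)   # random topic-word matrix
alpha = np.full(3, 0.5)                     # symmetric Dirichlet parameter
docs, thetas = generate_corpus(5, 20, alpha, beta)
```

With a small `alpha` such as 0.5, most sampled documents concentrate on one or two topics; larger values give more even mixtures.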
Related resources
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
以狄式分佈為基礎之多語聲學模型拆分及合併 (Multilingual Acoustic Model Splitting and Merging by Latent Dirichlet Allocation) [In Chinese]
Avoiding confusion between the phonetic acoustic models of different languages is one of the main challenges in multilingual speech recognition. We propose a method based on Latent Dirichlet Allocation to avoid this confusion. We split phonetic acoustic models based on tri-phones, and merge the groups selected by Latent Dirichlet Al...
Distributed Latent Dirichlet Allocation via Tensor Factorization
We describe a distributed implementation for Latent Dirichlet Allocation parameter estimation based upon the method of moments.
Hyperspectral Unmixing with Endmember Variability using Semi-supervised Partial Membership Latent Dirichlet Allocation
A semi-supervised Partial Membership Latent Dirichlet Allocation approach is developed for hyperspectral unmixing and endmember estimation while accounting for spectral variability and spatial information. Partial Membership Latent Dirichlet Allocation is an effective approach for spectral unmixing while representing spectral variability and leveraging spatial information. In this work, we exte...
Experiments with Latent Dirichlet Allocation
Latent Dirichlet Allocation is a generative topic model for text. In this report, we implement collapsed Gibbs sampling to learn the topic model. We test our implementation on two data sets: classic400 and Psychological Abstract Review. We also discuss different ways of evaluating the models' goodness-of-fit and how parameter settings interact with it.
Journal title:
- Journal of Machine Learning Research
Volume 3, Issue
Pages -
Publication date: 2003